Lanier - A Ricoh Company » Production Workflow

MicroPress® Document Production System. The MicroPress system 
combines advanced job ... Now, operators can manage more print jobs 
with less staff, ... 

www.lanier.com/page.php/production%20workflow - 22k - 
Cached - Similar pages 

Adobe JobReady - Partners 

Print job turnarounds are improved while digital document production
costs are lowered. Ideal for facilities management sites and print centers
within ...

www.adobe.com/products/pdfjobready/partners.html - 62k - 
Cached - Similar pages 

Canon Business Solutions - Document Mastering 

The ImageSmart Document Mastering Suite is a software solution that 
enhances ... The ability to preview print jobs with finishing settings allows 
for quick ... 

www.solutions.canon.com/doc_mastering.aspx - 13k - 
Cached - Similar pages 

Gestetner - Desk Top Editor for Production Overview 

Desk Top Editor for Production makes it easy to manage print jobs. 
Simply "Drag & Drop" documents to combine a proposal, comparison 
chart, presentation and ... 

www.gestetnerusa.com/Gestetner/gestetner_comV4.nsf/ 
(AII)/DeskTopEditorforProductionOverview.html - 15k -
Cached - Similar pages 

[pdf] IBM Visual Job Ticketing 

File Format: PDF/Adobe Acrobat - View as HTML 
production intent of the document; it can then be used whenever the job
is printed, even if the print job is routed to a different device.
Conversely, ...

www.printers.ibm.com/internet/comnelit.nsf/ Files/G563-0799-00/ 
$File/G563-0799-00.pdf?OpenElement&site= - Similar pages 

Xerox - Document Management System with Freeflow Software and ... 

FreeFlow Output Manager improves your digital print production capacity, turnaround and 
productivity by centralizing control of print job management across ... 
www.xerox.com/... / FreeFlow%20Digital%20Workflow&Xcntry=USA&Xlang=en_US - 64k - 
Cached - Similar pages 

Xerox - Document Centre™ 470 ST 

Production Equipment. For high-volume printing and specialized applications ... Walk-up 
users can scan a copy job while the Document Centre 470 is printing ... 

www.xerox.com/go/xrx/equipment/product_details.jsp?
prodID=DC470ST&Xcntry=USA&Xlang=en_US - 53k - Cached - Similar pages



Sponsored Links 

Printing Job Listings 

Printing Job, Printer, Typesetter 
Print Production, Operator Employ 
www.iHirePrinting.com 

Print Production Jobs 

Search for jobs, post your resume 
& find the job you're looking for. 
www.monster.com 

Print Production 

Print Production info 
from several different companies. 
BusinessChambers.com 

Employment Form 

1000s of forms. Updated daily. 
Fast, Easy, Free, Safe & No Spam! 
www.xdrive.com 

printing jobs 

Find printing jobs now 
500,000+ manufacturers online 
www.SourceTool.com 

Printing jobs 

Find Exactly what you are looking 
for at Webled - Printing jobs 
Webled.com 
Production Printing | Document Management & Distribution | Forms ... In busy
production print environments, PageScope Job Spooler can be the key to greater ...
kmbs.konicaminolta.us/eprise/main/ KMBS/Solutions/Suites/PageScopeSoftwareSuite - 40k
- Cached - Similar pages

EFI - News - Press Releases 

Canon USA And EFI Expand Graphics Document Production And Color Control With 
New ... Digital StoreFront is a next-generation print job submission and ... 
www.ir.efi.com/phoenix.zhtml?c=117454& p=irol-newsArticle&ID=670813&highlight= - 21k - 
Cached - Similar pages

[pdf] Establishing a Secure Document Production Environment 

File Format: PDF/Adobe Acrobat - View as HTML 

establishing a security strategy is the document production environment. ... track data and 
documents. Print, copy, fax and scan jobs are tracked with ... 
www.copiers.toshiba.com/ whatsnew/security_whitepaper.pdf - Similar pages
Online Digital Document Submission for Print Delivers!

WebCRD enables direct submission of print jobs to all networked 
production printers without error-prone re-keying of information by print 
center staff. ... 

outputlinks.com/html/General/news-00846.shtml - 8k - 
Cached - Similar pages 

EFI - News - Press Releases 

The Fiery Production Printing Package delivers functionality exclusively 
provided by EFI. The Package is a set of advanced job management and 
submission ... 

www.ir.efi.com/phoenix.zhtml?c=117454& p=irol- 
newsArticle&ID=748729&highlight= - 19k - Cached - Similar pages

Konica Minolta Business Solutions, USA, Inc. 

Production Printing | Document Management & Distribution | Forms & 
Variable ... Create make-ready layouts of print jobs before sending them 
to the printer. ... 

kmbs.konicaminolta.us/eprise/main/ 
KMBS/Solutions/Categories/ProductionPrinting - 36k - 
Cached - Similar pages 

DocuLex - A Recognized Leader In Document Imaging Software

Printed images may contain production numbers and can be identified 
according to document unitization. IPStudio enables the user with 
printing capabilities ... 

www.doculex.com/ipstudio.asp - 42k - Cached - Similar pages 

Canon Business Solutions - Solutions 

In today's competitive POD marketplace, many print service providers and 
in-house CRDs are looking for ways to automate the document 
production process, ... 

www.solutions.canon.com/solutions.aspx - 22k - Cached - Similar pages


Unicorn Enterprises SA 

Generally, the print server handles all document production instructions in the device- 
independent way as job attributes. When the document is about to be ... 
www.unicorn-enterprises.com/uxps_overview.html - 24k - Cached - Similar pages



IBM displays leading automated print technologies at AIIM ON ...

IBM plans to showcase IBM Infoprint Workflow, an end-to-end automated document factory 
solution and the infrastructure for production print and mail ... 

www.printers.ibm.com/internet/ wwsites.nsf/vwwebpublished/workflow051705pr_ww - 23k - 
Cached - Similar pages

Speed up data traffic and dynamic document production 

Speed up data traffic and dynamic document production. ... Uncertainties about how a job 
is going to print not only waste the designers' and programmers' ... 
www.redtitan.com/view.htm - 10k - Cached - Similar pages
Multiple job entry points for document production control and ...

As is well known the print job and parameter database 26 contains the necessary 
parameters to enable printed generation of documents (eg, ... 
www.freepatentsonline.com/6278988.html - 34k - Cached - Similar pages

Adobe Systems Incorporated 

For resellers, developers, solutions providers, ISVs, print service providers, ... Publish, 
share, review, and mark up 3D designs in Intelligent Documents. ... 

www.adobe.com/ - 29k - Cached - Similar pages 
Techniques for automatically correcting words in text

Karen Kukich 

December 1992 ACM Computing Surveys (CSUR), volume 24 issue 4 
Publisher: ACM Press 

Full text available: pdf(6.23 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Research aimed at correcting words in text has focused on three progressively more 
difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3)
context-dependent word correction. In response to the first problem, efficient
pattern-matching and n-gram analysis techniques have been developed for detecting strings that
do not appear in a given word list. In response to the second problem, a variety of 
general and application-specific spelling cor ... 

Keywords: n-gram analysis, Optical Character Recognition (OCR), context-dependent 
spelling correction, grammar checking, natural-language-processing models, neural net 
classifiers, spell checking, spelling error detection, spelling error patterns, statistical- 
language models, word recognition and correction 





Computing curricula 2001 

September 2001 Journal on Educational Resources in Computing (JERIC) 
Publisher: ACM Press 

Full text available: pdf(613.63 KB), html(2.78 KB) Additional Information: full citation, references, citings, index terms





Technical reports 

SIGACT News Staff 

January 1980 ACM SIGACT News, Volume 12 issue 1
Publisher: ACM Press 

Full text available: pdf(5.28 MB) Additional Information: full citation
Early user-system interaction for database selection in massive domain-specific
online environments 

Jack G. Conrad, Joanne R. S. Claussen 

January 2003 ACM Transactions on Information Systems (TOIS), volume 21 issue 1 
Publisher: ACM Press 

Full text available: pdf(845.54 KB) Additional Information: full citation, abstract, references, index terms

The continued growth of very large data environments such as Westlaw and Dialog, in 
addition to the World Wide Web, increases the importance of effective and efficient 
database selection and searching. Current research focuses largely on completely 
autonomous and automatic selection, searching, and results merging in distributed 
environments. This fully automatic approach has significant deficiencies, including reliance 
upon thresholds below which databases with relevant documents are not search ... 

Keywords: Database selection, metadata for retrieval, structuring information to aid 
search and navigation, user interaction 




Efficient passage ranking for document databases 

Marcin Kaszkiel, Justin Zobel, Ron Sacks-Davis 

October 1999 ACM Transactions on Information Systems (TOIS), volume 17 issue 4 
Publisher: ACM Press 

Full text available: pdf(328.98 KB) Additional Information: full citation, abstract, references, citings, index terms

Queries to text collections are resolved by ranking the documents in the collection and 
returning the highest-scoring documents to the user. An alternative retrieval method is to 
rank passages, that is, short fragments of documents, a strategy that can improve 
effectiveness and identify relevant material in documents that are too large for users to 
consider as a whole. However, ranking of passages can considerably increase retrieval 
costs. In this article we explore alternative query evalua ... 

Keywords: inverted files, passage retrieval, query evaluation, text databases, text 
retrieval 




6 Selected IR-Related Dissertation Abstracts 

February 1992 ACM SIGIR Forum, Volume 26 Issue 1 
Publisher: ACM Press 

Full text available: pdf(2.24 MB) Additional Information: full citation




7 Selected IR-Related Dissertation Abstracts

March 1993 ACM SIGIR Forum, Volume 27 Issue 1 
Publisher: ACM Press 

Full text available: pdf(2.24 MB) Additional Information: full citation, abstract




The following are citations selected by title and abstract as being related to Information 
Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of 
the Dissertation Abstracts Online database produced by University Microfilms International 
(UMI). Included are UMI order number, title, author, degree, year, institution; number of 
pages, and abstract. Unless otherwise specified, paper or microform copies of 
dissertations may be ordered from University Microfilms Inter ... 
8 Image Retrieval from the World Wide Web: Issues, Techniques, and Systems




M. L. Kherfi, D. Ziou, A. Bernardi 

March 2004 ACM Computing Surveys (CSUR), Volume 36 issue 1
Publisher: ACM Press 

Full text available: pdf(294.13 KB) Additional Information: full citation, abstract, references, index terms

With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have 
existed for some years now and have achieved a certain degree of success. However, 
despite the large number of images available on the Web, image search engines are still 
rare. In this article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 



9 Facial modeling and animation 

Jorg Haber, Demetri Terzopoulos

August 2004 Proceedings of the conference on SIGGRAPH 2004 course notes GRAPH '04

Publisher: ACM Press

Full text available: pdf(18.15 MB) Additional Information: full citation, abstract

In this course we present an overview of the concepts and current techniques in facial 
modeling and animation. We introduce this research area by its history and applications. 
As a necessary prerequisite for facial modeling, data acquisition is discussed in detail. We 
describe basic concepts of facial animation and present different approaches including 
parametric models, performance-, physics-, and learning-based methods. State-of-the-art 
techniques such as muscle-based facial animation, mass-s ... 

10 Fast detection of communication patterns in distributed executions 

Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced
Studies on Collaborative research
Publisher: IBM Press

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 

11 Multikey access methods based on superimposed coding techniques

R. Sacks-Davis, A. Kent, K. Ramamohanarao 

November 1987 ACM Transactions on Database Systems (TODS), Volume 12 issue 4 
Publisher: ACM Press 

Full text available: pdf(371 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Both single-level and two-level indexed descriptor schemes for multikey retrieval are 
presented and compared. The descriptors are formed using superimposed coding 
techniques and stored using a bit-inversion technique. A fast-batch insertion algorithm for 
which the cost of forming the bit-inverted file is less than one disk access per record is 
presented. For large data files, it is shown that the two-level implementation is generally 
more efficient for queries with a small number of matchin ...

12 Burst tries: a fast, efficient data structure for string keys

April 2002 ACM Transactions on Information Systems (TOIS), volume 20 issue 2 
Publisher: ACM Press 

Full text available: pdf(324.84 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Many applications depend on efficient management of large sets of distinct strings in 
memory. For example, during index construction for text databases a record is held for 
each distinct word in the text, containing the word itself and information such as counters. 
We propose a new data structure, the burst trie, that has significant advantages over 
existing options for such applications: it uses about the same memory as a binary search 
tree; it is as fast as a trie; and, while not as fast as a ... 

Keywords: Binary trees, splay trees, string data structures, text databases, tries, 
vocabulary accumulation 




13 Information systems security design methods: implications for information systems
development

Richard Baskerville

December 1993 ACM Computing Surveys (CSUR), Volume 25 Issue 4 

Publisher: ACM Press 

Full text available: pdf(3.44 MB) Additional Information: full citation, abstract, references, citings, index terms

The security of information systems is a serious issue because computer abuse is 
increasing. It is important, therefore, that systems analysts and designers develop 
expertise in methods for specifying information systems security. The characteristics 
found in three generations of general information system design methods provide a 
framework for comparing and understanding current security design methods. These 
methods include approaches that use checklists of controls, divide functional req ... 

Keywords: checklists, control, integrity, risk analysis, safety, structured systems analysis 
and design, system modeling 




14 Digital libraries and cyberinfrastructure track: creating information representations for
the humanities (part 1): Finding a catalog: generating analytical catalog records from
well-structured digital texts

David Mimno, Alison Jones, Gregory Crane 

June 2005 Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf(210.23 KB) Additional Information: full citation, abstract, references, index terms

One of the criticisms library users often make of catalogs is that they rarely include 
information below the bibliographic level. It is generally impossible to search a catalog for 
the titles and subjects of particular chapters or volumes. There has been no way to add 
this information to catalog records without exponentially increasing the workload of 
catalogers. At the same time, well-structured full-text XML transcriptions of printed works
are becoming increasingly available. This paper descri ... 

Keywords: analytical cataloging, information extraction, library automation 
15 Self-indexing inverted files for fast text retrieval

Alistair Moffat, Justin Zobel 

October 1996 ACM Transactions on Information Systems (TOIS), Volume 14 issue 4 
Publisher: ACM Press 

Full text available: pdf(484.52 KB) Additional Information: full citation, abstract, references, citings, index terms

Query-processing costs on large text databases are dominated by the need to retrieve 
and scan the inverted list of each query term. Retrieval time for inverted lists can be 
greatly reduced by the use of compression, but this adds to the CPU time required. Here 
we show that the CPU component of query response time for conjunctive Boolean queries 
and for informal ranked queries can be similarly reduced, at little cost in terms of storage, 
by the inclusion of an internal index in each compress ... 

16 Link and channel measurement: A simple mechanism for capturing and replaying
wireless channels

Glenn Judd, Peter Steenkiste

August 2005 Proceeding of the 2005 ACM SIGCOMM workshop on Experimental
approaches to wireless network design and analysis E-WIND '05

Publisher: ACM Press

Full text available: pdf(6.06 MB) Additional Information: full citation, abstract, references, index terms

Physical layer wireless network emulation has the potential to be a powerful experimental 
tool. An important challenge in physical emulation, and traditional simulation, is to 
accurately model the wireless channel. In this paper we examine the possibility of using 
on-card signal strength measurements to capture wireless channel traces. A key 
advantage of this approach is the simplicity and ubiquity with which these measurements 
can be obtained since virtually all wireless devices provide the req ... 

Keywords: channel capture, emulation, wireless 






17 The paraphrase search assistant: terminological feedback for iterative information 
seeking 

Peter G. Anick, Suresh Tipirneni 

August 1999 Proceedings of the 22nd annual international ACM SIGIR conference on
Research and development in information retrieval

Publisher: ACM Press

Full text available: pdf(116.89 KB) Additional Information: full citation, references, citings, index terms



Keywords: data visualization, query reformulation, terminological feedback 




18 Level set and PDE methods for computer graphics 

David Breen, Ron Fedkiw, Ken Museth, Stanley Osher, Guillermo Sapiro, Ross Whitaker 
August 2004 Proceedings of the conference on SIGGRAPH 2004 course notes GRAPH '04

Publisher: ACM Press

Full text available: pdf(17.07 MB) Additional Information: full citation, abstract

Level set methods, an important class of partial differential equation (PDE) methods, 
define dynamic surfaces implicitly as the level set (iso-surface) of a sampled, evolving nD 
function. The course begins with preparatory material that introduces the concept of using 
partial differential equations to solve problems in computer graphics, geometric modeling 
and computer vision. This will include the structure and behavior of several different types 
of differential equations, e.g. the level set eq ...

19 Enhanced hypertext categorization using hyperlinks 

Soumen Chakrabarti, Byron Dom, Piotr Indyk 

June 1998 ACM SIGMOD Record, Proceedings of the 1998 ACM SIGMOD international
conference on Management of data SIGMOD '98, volume 27 issue 2
Publisher: ACM Press

Full text available: pdf(1.91 MB) Additional Information: full citation, abstract, references, citings, index terms

A major challenge in indexing unstructured hypertext databases is to automatically 
extract meta-data that enables structured search using topic taxonomies, circumvents 
keyword ambiguity, and improves the quality of search and profile-based routing and 
filtering. Therefore, an accurate classifier is an essential component of a hypertext 
database. Hyperlinks pose new problems not addressed in the extensive text classification 
literature. Links clearly contain high-quality semantic clues that ... 

20 Information retrieval session 8: efficiency: Online duplicate document detection:
signature reliability in a dynamic retrieval environment

Jack G. Conrad, Xi S. Guo, Cindy P. Schriber 

November 2003 Proceedings of the twelfth international conference on Information
and knowledge management

Publisher: ACM Press

Full text available: pdf(215.37 KB) Additional Information: full citation, abstract, references, citings, index terms

As online document collections continue to expand, both on the Web and in proprietary 
environments, the need for duplicate detection becomes more critical. Few users wish to 
retrieve search results consisting of sets of duplicate documents, whether identical 
duplicates or close matches. Our goal in this work is to investigate the phenomenon and 
determine one or more approaches that minimize its impact on search results. Recent 
work has focused on using some form of signature to characterize a do ... 

Keywords: data management, doc signatures, duplicate document detection 
ZBroker: a query routing broker for Z39.50 databases
Yong Lin, Jian Xu, Ee-Peng Lim, Wee-Keong Ng 

November 1999 Proceedings of the eighth international conference on Information
and knowledge management

Publisher: ACM Press

Full text available: pdf(1.15 MB) Additional Information: full citation, abstract, references, citings, index terms

A query routing broker is a software agent that determines from a large set of accessing 
information sources the ones most relevant to a user's information need. As the number 
of information sources on the Internet increases dramatically, future users will have to 
rely on query routing brokers to decide a small number of information sources to query 
without incurring too much query processing overheads. In this paper, we describe a 
query routing broker known as ZBroker developed for bibliog ... 





2 Harp: a distributed query system for legacy public libraries and structured databases 

Ee-Peng Lim, Ying Lu 

July 1999 ACM Transactions on Information Systems (TOIS), volume 17 issue 3 
Publisher: ACM Press 

Full text available: pdf(196.58 KB) Additional Information: full citation, abstract, references, citings, index terms, review

The main purpose of a digital library is to facilitate users easy access to enormous amount 
of globally networked information. Typically, this information includes preexisting public 
library catalog data, digitized document collections, and other databases. In this article, 
we describe the distributed query system of a digital library prototype system known as 
HARP. In the HARP project, we have designed and implemented a distributed query 
processor and its query front-end to support integr ... 




Keywords: Internet databases, digital libraries, interoperable databases 
3 User preferences when searching individual and integrated full-text databases

Soyeon Park

August 1999 Proceedings of the fourth ACM conference on Digital libraries
Publisher: ACM Press

Full text available: pdf(158.43 KB) Additional Information: full citation, references, citings, index terms
4 An integrated approach to documentation retrieval using a spires database

Suzanne Schluederberg 

October 1988 Proceedings of the 6th annual international conference on Systems
documentation
Publisher: ACM Press

Full text available: pdf(553.02 KB) Additional Information: full citation, index terms




5 Database theory, technology and applications (DTTA): Simplified access to
structured databases by adapting keyword search and database selection

Mohammad Hassan, Reda Alhajj, Mick J. Ridley, Ken Barker

March 2004 Proceedings of the 2004 ACM symposium on Applied computing 

Publisher: ACM Press 

Full text available: pdf(219.19 KB) Additional Information: full citation, abstract, references, index terms

This paper presents a tool that enables non-technical (naive) end-users to use free-form 
queries in exploring distributed relational databases with simple and direct technique, in a 
fashion similar to using search engines to search text files on the web. This allows web 
designers and database developers to publish their databases for web browsers exploring. 
The proposed approach can be used for both Internet and Intranet application areas. Our 
approach depends on identifying first databases that ... 

Keywords: database selection, information retrieval, keyword search, relational 
databases 





6 Document querying and transformation: Fast structural query with application to
Chinese treebank sentence retrieval 

Chia-Hsin Huang, Tyng-Ruey Chuang, Hahn-Ming Lee 

October 2004 Proceedings of the 2004 ACM symposium on Document engineering 
Publisher: ACM Press 

Full text available: pdf(475.51 KB) Additional information: full citation , abstract , references , index terms 





In natural language processing a huge amount of structured data is constantly used for 
the extraction and presentation of grammatical structures in sentences. For example the 
Chinese Treebank corpus developed at the Institute of Information Science Academia 
Sinica Taiwan is a semantically annotated corpus that has been used to help parse and 
study Chinese sentences. In this setting users usually use structured tree patterns instead 
of keywords to query the corpus. 

In this paper we pres ... 

Keywords: XML, structural query, treebank 
March 1990 ACM SIGOIS Bulletin, Proceedings of the conference on Office
Full text available: pdf(1.24 MB) Additional Information: full citation, abstract, references, citings, index terms

One of the main component of integrated office systems is the large central filing system. 
It efficiently stores, retrieves and searches office documents containing text, images, 
graphics, data and voice. We propose to implement a filing system on top of the 
Darmstadt database system (DASDBS), which is designed as a data management kernel 
for both standard and non-standard applications. This paper investigates the choice of 
appropriate storage structures for the filing system objects and th ... 

8 An evaluation of retrieval effectiveness for a full-text document-retrieval system 

David C. Blair, M. E. Maron 

March 1985 Communications of the ACM, Volume 28 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.16 MB) Additional Information: full citation, abstract, references, citings, index terms, review

An evaluation of a large, operational full-text document-retrieval system (containing 
roughly 350,000 pages of text) shows the system to be retrieving less than 20 percent of 
the documents relevant to a particular search. The findings are discussed in terms of the 
theory and practice of full-text document retrieval. 

9 Interactive term suggestion for users of digital libraries: using subject thesauri and
co-occurrence lists for information retrieval



Bruce R. Schatz, Eric H. Johnson, Pauline A. Cochrane, Hsinchun Chen 

April 1996 Proceedings of the first ACM international conference on Digital libraries 

Publisher: ACM Press 

Full text available: pdf(974.58 KB) Additional Information: full citation, references, citings, index terms







10 Selected IR-Related Dissertation Abstracts 

May 1991 ACM SIGIR Forum, Volume 25 Issue 1 
Publisher: ACM Press 
Full text available: pdf(2.71 MB) Additional Information: full citation, abstract

The following are citations selected by title and abstract as being related to Information 
Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of 
the Dissertation Abstracts Online database produced by University Microfilms International 
(UMI). Included are UMI order number, title, author, degree, year, institution; number of 
pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen 
by the author, and abstract. Unless otherwise spec ... 

11 A clustered search algorithm incorporating arbitrary term dependencies

K. Lam, C. T. Yu 

September 1982 ACM Transactions on Database Systems (TODS), volume 7 issue 3 
Publisher: ACM Press 

Full text available: pdf(669.49 KB) Additional Information: full citation, abstract, references, citings, index terms

The documents in a database are organized into clusters, where each cluster contains 
similar documents and a representative of these documents. A user query is compared 
with all the representatives of the clusters, and on the basis of such comparisons, those 
clusters having many close neighbors with respect to the query are selected for searching. 
This paper presents an estimation of the number of close neighbors in a cluster in relation 
to the given query. The estimation t ... 
area.)

Gerald Salton

April 1989 ACM SIGIR Forum, Volume 23 issue 3-4 

Publisher: ACM Press 

Full text available: pdf(1.09 MB) Additional Information: full citation



13 Design of an OPAC database to permit different subject searching accesses in a 
multi-disciplines universities library catalogue database

Maristella Agosti, Maurizio Masotti

June 1992 Proceedings of the 15th annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(877.96 KB) Additional Information: full citation, abstract, references, index terms

This paper presents searching approaches and user interface capabilities of DUO, an 
Online Public Access Catalogue (OPAC) designed to permit the users of three Universities 
of the Northeast of Italy different subject searching accesses to the co-operative multi- 
disciplines library catalogue database. The co-operative catalogue database is managed 
by one of the software systems developed under the Italian national project for library
automation: the SBN project. Since the SBN data ... 

14 Multiple search engines in database merging 

Ellen M. Voorhees, Richard M. Tong 

July 1997 Proceedings of the second ACM international conference on Digital libraries
Publisher: ACM Press

Full text available: pdf(1.52 MB) Additional Information: full citation, references, citings, index terms







15 On the encipherment of search trees and random access files 

R. Bayer, J. K. Metzger 

March 1976 ACM Transactions on Database Systems (TODS), volume 1 issue 1
Publisher: ACM Press 

Full text available: pdf(1.30 MB) Additional Information: full citation, abstract, references, citings, index terms

The securing of information in indexed, random access files by means of privacy 
transformations must be considered as a problem distinct from that for sequential files. 
Not only must processing overhead due to encrypting be considered, but also threats to 
encipherment arising from updating and the file structure itself must be countered. A 
general encipherment scheme is proposed for files maintained in a paged structure in 
secondary storage. This is applied to the encipherment of indexes or ... 

Keywords: B-trees, cryptography, encipherment, indexed sequential files, indexes, 
paging, privacy, privacy transformation, protection, random access files, search trees, 
security 
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16 A browser for bibliographic information retrieval, based on an application of lattice 
theory

Gert Schmeltz Pedersen

July 1993 Proceedings of the 16th annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(995.71 KB) Additional Information: full citation, abstract, references, citings, index terms

An application of mathematical lattice theory, called relationship lattices, is utilized to 
attack problems of operational bibliographic information retrieval. The proposed solution 
offers an interface to the information searcher enabling operation in a world of concepts, 
authors, and document records and their relationships. This hides the complexities of 
query language and database structures, and it allows to use a personally preferred 
terminology and to browse, query and down ... 

17 The design of a document database 

Chris Clifton, Hector Garcia-Molina

January 2000 Proceedings of the ACM conference on Document processing systems 
Publisher: ACM Press 

Full text available: pdf(758.62 KB) Additional Information: full citation, references, citings, index terms







18 Generation and search of clustered files 

G. Salton, A. Wong 

December 1978 ACM Transactions on Database Systems (TODS), volume 3 issue 4 
Publisher: ACM Press 

Full text available: pdf(1.78 MB) Additional Information: full citation, abstract, references, citings, index terms

A classified, or clustered file is one where related, or similar records are grouped into 
classes, or clusters of items in such a way that all items within a cluster are jointly 
retrievable. Clustered files are easily adapted to broad and narrow search strategies, and 
simple file updating methods are available. An inexpensive file clustering method 
applicable to large files is given together with appropriate file search methods. An abstract 
model is then introduced to predict the retrieval ... 

Keywords: automatic classification, cluster searching, clustered files, fast classification, 
file organization, probabilistic models 





19 XRel: a path-based approach to storage and retrieval of XML documents using 
relational databases 

August 2001 ACM Transactions on Internet Technology (TOIT), volume 1 issue 1
Publisher: ACM Press 

Full text available: pdf(264.27 KB) Additional Information: full citation, abstract, references, citings, index terms, review

This article describes XRel, a novel approach for storage and retrieval of XML documents 
using relational databases. In this approach, an XML document is decomposed into nodes 
on the basis of its tree structure and stored in relational tables according to the node 
type, with path information from the root to each node. XRel enables us to store XML 
documents using a fixed relational schema without any information about DTDs and also 
to utilize indices such as the B+ 

Keywords: XML query, XPath, text markup, text tagging 
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20 Probabilistic and genetic algorithms in document retrieval

M. Gordon 





October 1988 Communications of the ACM, volume 31 issue 10
Publisher: ACM Press 

Full text available: pdf(1.27 MB) Additional Information: full citation, abstract, references, citings, index terms

Document retrieval systems are built to provide inquirers with computerized access to 
relevant documents. Such systems often miss many relevant documents while falsely 
identifying many non-relevant documents. Here, competing document descriptions are 
associated with a document and altered over time by a genetic algorithm according to the 
queries used and relevance judgments made during retrieval. 
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1 Fast detection of communication patterns in distributed executions 

Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced
Studies on Collaborative research

Publisher: IBM Press

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms




Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 




Visualizing search results: some alternatives to query-document similarity 

Lucy Terry Nowell, Robert K. France, Deborah Hix, Lenwood S. Heath, Edward A. Fox 
August 1996 Proceedings of the 19th annual international ACM SIGIR conference on
Research and development in information retrieval
Publisher: ACM Press

Full text available: pdf(1.80 MB) Additional Information: full citation, references, citings, index terms





3 Document Examiner: delivery interface for hypertext documents 

Janet H. Walker 

November 1987 Proceeding of the ACM conference on Hypertext 
Publisher: ACM Press 

Full text available: pdf(1.28 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes the user interface strategy of Document Examiner, a delivery
interface for commercial hypertext documents. Unlike many hypertext interfaces, 
Document Examiner does not adopt the directed graph as its fundamental user-visible 
navigation model. Instead it offers context evaluation and content-based searching 
capabilities that are based on consideration of the strategies that people use in interacting 
with paper documents. 

Interactive term suggestion for users of digital libraries: using subject thesauri and co- 
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occurrence lists for information retrieval 

Bruce R. Schatz, Eric H. Johnson, Pauline A. Cochrane, Hsinchun Chen 

April 1996 Proceedings of the first ACM international conference on Digital libraries 

Publisher: ACM Press 

Full text available: pdf(974.58 KB) Additional Information: full citation, references, citings, index terms



On the measurement of inter-linker consistency and retrieval effectiveness in 
hypertext databases 

David Ellis, Jonathan Furner-Hines, Peter Willett 

August 1994 Proceedings of the 17th annual international ACM SIGIR conference on
Research and development in information retrieval

Publisher: Springer-Verlag New York, Inc. 
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Building searchable collections of enterprise speech data 

James W. Cooper, Mahesh Viswanathan, Donna Byron, Margaret Chan 

January 2001 Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries
Publisher: ACM Press

Full text available: pdf(356.53 KB) Additional Information: full citation, abstract, references, index terms

We have applied speech recognition and text-mining technologies to a set of recorded 
outbound marketing calls and analyzed the results. Since speaker-independent speech 
recognition technology results in a significantly lower recognition rate than that found 
when the recognizer is trained for a particular speaker, we applied a number of post- 
processing algorithms to the output of the recognizer to render it suitable for the Textract 
text mining system. We indexed the call transcri ... 

Keywords: document display, search, speech analysis, speech retrieval, text mining 



7 P1: "Yes, but does it scale?": practical considerations for database-driven information 
systems

John Russell

October 2001 Proceedings of the 19th annual international conference on Computer documentation

Publisher: ACM Press

Full text available: pdf(231.31 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper explores the process of designing and implementing a database-driven system 
of online documentation, and putting it live on the web for customers to use. Using real- 
life examples, it discusses practical considerations for balancing performance, scalability, 
and reliability. 

Keywords: Oracle, automation, categorization, database, performance, reliability, 
scalability, web services 

8 Document adaptation: Supporting virtual documents in just-in-time hypermedia 
systems 

Li Zhang, Michael Bieber, David Millard, Vincent Oria 
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October 2004 Proceedings of the 2004 ACM symposium on Document engineering 
Publisher: ACM Press 

Full text available: pdf(707.51 KB) Additional Information: full citation, abstract, references, index terms

Many analytical or computational applications especially legacy systems create documents 
and display screens in response to user queries "dynamically" or in "real time". These 
"virtual documents" do not exist in advance and thus hypermedia features must be 
generated "just in time" - automatically and dynamically. Additionally the hypermedia 
features may have to cause target documents to be generated or re-generated. This 
paper focuses on the specific challenges faced in hypermedia support for ... 

Keywords: dynamic hypermedia functionality, dynamic regeneration, integration 
architecture, just-in-time hypermedia, re-identification, re-location, virtual documents 




9 Record-boundary discovery in Web documents 

D. W. Embley, Y. Jiang, Y.-K. Ng 

June 1999 ACM SIGMOD Record, Proceedings of the 1999 ACM SIGMOD international
conference on Management of data SIGMOD '99, volume 28 issue 2
Publisher: ACM Press

Full text available: pdf(1.36 MB) Additional Information: full citation, abstract, references, citings, index terms

Extraction of information from unstructured or semi structured Web documents often 
requires a recognition and delimitation of records. (By "record" we mean a group of 
information relevant to some entity.) Without first chunking documents that contain 
multiple records according to record boundaries, extraction of record information will not 
likely succeed. In this paper we describe a heuristic approach to discovering record 
boundaries in Web documents. In our approach, we capture ... 

10 The Logical Record Access Approach to Database Design 

Toby J. Teorey, James P. Fry 

June 1980 ACM Computing Surveys (CSUR), Volume 12 issue 2 
Publisher: ACM Press 

Full text available: pdf(2.81 MB) Additional Information: full citation, references, citings, index terms







11 Special issue: AI in engineering

D. Sriram, R. Joobbani 
April 1985 ACM SIGART Bulletin, issue 92 

Publisher: ACM Press 

Full text available: pdf(8.79 MB) Additional Information: full citation, abstract

The papers in this special issue were compiled from responses to the announcement in 
the July 1984 issue of the SIGART newsletter and notices posted over the ARPAnet. The 
interest being shown in this area is reflected in the sixty papers received from over six 
countries. About half the papers were received over the computer network. 

12 Application of OODB and SGML techniques in text database: an electronic dictionary 
system

Jian Zhang 

March 1995 ACM SIGMOD Record, Volume 24 issue 1
Publisher: ACM Press

Full text available: pdf(557.23 KB) Additional Information: full citation, abstract, citings, index terms
An electronic dictionary system (EDS) is developed with object-oriented database 
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techniques based on ObjectStore. The EDS is composed of two parts: the Database 
Building Program (DBP), and the Database Querying Program (DQP). DBP reads in a 
dictionary encoded in SGML tags, and builds a database composed of a collection of trees 
which holds dictionary entries, and several lists which contain items of various lexical 
categories. With text exchangeability introduced by the SGML, DBP is able to acco ... 

Keywords: SGML, object-oriented databases, text database 



13 HyperFile: a data and query model for documents 

Chris Clifton, Hector Garcia-Molina, David Bloom 

January 1995 The VLDB Journal — The International Journal on Very Large Data 

Bases, Volume 4 Issue 1 

Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(2.04 MB) Additional Information: full citation, abstract, references, citings




Non-quantitative information such as documents and pictures pose interesting new 
problems in the database world. Traditional data models and query languages do not 
provide appropriate support for this information. Such data are typically stored in file 
systems, which do not provide the security, integrity, or query features of database 
management systems. The hypertext model has emerged as a good interface to this 
information; however, finding information using hypertext browsing does not ... 

Keywords: hypertext, indexing, user interface 
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January 2000 Proceedings of the ACM conference on Document processing systems 
Publisher: ACM Press 

Full text available: pdf(758.62 KB) Additional Information: full citation, references, citings, index terms





15 Design of an OPAC database to permit different subject searching accesses in a
multi-disciplines universities library catalogue database

Maristella Agosti, Maurizio Masotti 

June 1992 Proceedings of the 15th annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(877.96 KB) Additional Information: full citation, abstract, references, index terms

This paper presents searching approaches and user interface capabilities of DUO, an 
Online Public Access Catalogue (OPAC) designed to permit the users of three Universities 
of the Northeast of Italy different subject searching accesses to the co-operative multi- 
disciplines library catalogue database. The co-operative catalogue database is managed 
by one of the software systems developed under the Italian national project for library
automation: the SBN project. Since the SBN data ... 

16 Gaze: EyePrint: support of document browsing with eye gaze trace

Takehiko Ohno 

October 2004 Proceedings of the 6th international conference on Multimodal interfaces
Publisher: ACM Press

Full text available: pdf(534.23 KB) Additional Information: full citation, abstract, references, index terms





Current digital documents provide few traces to help user browsing. This makes document 
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browsing difficult, and we sometimes feel it is hard to keep track of all of the information. 
To overcome this problem, this paper proposes a method of creating traces on digital 
documents. The method, called EyePrint, generates a trace from the user's eye gaze in 
order to support the browsing of digital document. Traces are presented as highlighted 
areas on a document, which become visual cues for accessi ... 

Keywords: document browsing, eyePrint, gaze-based interaction, information retrieval, 
readwear, reusability problem 
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18 PERSIVAL, a system for personalized search and summarization over multimedia
healthcare information 



Kathleen R. McKeown, Shih-Fu Chang, James Cimino, Steven Feiner, Carol Friedman, Luis 
Gravano, Vasileios Hatzivassiloglou, Steven Johnson, Desmond A. Jordan, Judith L. Klavans, 
Andre Kushniruk, Vimla Patel, Simone Teufel 

January 2001 Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries
Publisher: ACM Press

Full text available: pdf(369.13 KB) Additional Information: full citation, abstract, references, citings, index terms

In healthcare settings, patients need access to online information that can help them
understand their medical situation. Physicians need information that is clinically relevant 
to an individual patient. In this paper, we present our progress on developing a system, 
PERSIVAL, that is designed to provide personalized access to a distributed patient care 
digital library. Using the secure, online patient records at New York Presbyterian Hospital 
as a user model, PERSIVAL's components tailor s ... 

Keywords: medical digital library, multimedia, natural language, personalization, query 
interface, search, summarization 
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Traditional document systems use hierarchical filing structures as the basis for organizing, 
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storing and retrieving documents. However, this structure is very limited in comparison 
with the rich and varied forms of document interaction and category management in 
everyday document use. Presto is a prototype document management system providing 
rich interaction with documents through meaningful, user-level document attributes, such
as "Word file," "published paper," ...

Keywords: attribute/value systems, direct manipulation, document management 
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An example-based mapping method for text categorization and retrieval

Yiming Yang, Christopher G. Chute 

July 1994 ACM Transactions on Information Systems (TOIS), Volume 12 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.78 MB) Additional Information: full citation, abstract, references, citings, index terms

A unified model for text categorization and text retrieval is introduced. We use a training 
set of manually categorized documents to learn word-category associations, and use 
these associations to predict the categories of arbitrary documents. Similarly, we use a 
training set of queries and their related documents to obtain empirical associations 
between query words and indexing terms of documents, and use these associations to 
predict the related documents of arbitrary queries. A Linear Le ... 

Keywords: document categorization, query categorization, statistical learning of human 
decisions 





Database theory, technology and applications (DTTA): Simplified access to
structured databases by adapting keyword search and database selection 

Mohammad Hassan, Reda Alhajj, Mick J. Ridley, Ken Barker 

March 2004 Proceedings of the 2004 ACM symposium on Applied computing 

Publisher: ACM Press 

Full text available: pdf(219.19 KB) Additional Information: full citation, abstract, references, index terms

This paper presents a tool that enables non-technical (naive) end-users to use free-form 
queries in exploring distributed relational databases with simple and direct technique, in a 
fashion similar to using search engines to search text files on the web. This allows web 
designers and database developers to publish their databases for web browsers exploring. 
The proposed approach can be used for both Internet and Intranet application areas. Our 
approach depends on identifying first databases that ... 

Keywords: database selection, information retrieval, keyword search, relational 
databases 
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Research aimed at correcting words in text has focused on three progressively more 
difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3)
context-dependent word correction. In response to the first problem, efficient pattern-
matching and n-gram analysis techniques have been developed for detecting strings that 
do not appear in a given word list. In response to the second problem, a variety of 
general and application-specific spelling cor ... 

Keywords: n-gram analysis, Optical Character Recognition (OCR), context-dependent 
spelling correction, grammar checking, natural-language-processing models, neural net
classifiers, spell checking, spelling error detection, spelling error patterns, statistical- 
language models, word recognition and correction 
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October 1999 ACM Transactions on Information Systems (TOIS), volume 17 issue 4 
Publisher: ACM Press 

Full text available: pdf(328.98 KB) Additional Information: full citation, abstract, references, citings, index
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Queries to text collections are resolved by ranking the documents in the collection and 
returning the highest-scoring documents to the user. An alternative retrieval method is to 
rank passages, that is, short fragments of documents, a strategy that can improve 
effectiveness and identify relevant material in documents that are too large for users to 
consider as a whole. However, ranking of passages can considerably increase retrieval 
costs. In this article we explore alternative query evalua ... 

Keywords: inverted files, passage retrieval, query evaluation, text databases, text 
retrieval 
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The following are citations selected by title and abstract as being related to Information 
Retrieval (IR), resulting from a computer search, using BRS Information Technologies, of 
the Dissertation Abstracts Online database produced by University Microfilms International 
(UMI). Included are UMI order number, title, author, degree, year, institution; number of 
pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen 
by the author, and abstract. Unless otherwise spec ... 

6 OAI application: Extending SDARTS: extracting metadata from web databases and
interfacing with the open archives initiative

Panagiotis G. Ipeirotis, Tom Barry, Luis Gravano

July 2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf(303.33 KB) Additional Information: full citation, abstract, references, index terms
SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines 
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two complementary existing protocols, SDLIP and STARTS, to define a uniform interface 
that collections should support for searching and exporting metasearch-related metadata. 
SDARTS also includes a toolkit with wrappers that are easily customized to make both
local and remote document collections SDARTS-compliant. This paper describes two 
significant ways in which we have extended the SDARTS toolkit. First, we ... 

Keywords: SDLIP, distributed searching, metadata, metasearching, web databases, 
wrapper construction 
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November 1997 Proceedings of the 1997 conference of the Centre for Advanced

Studies on Collaborative research 

Publisher: IBM Press 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms




Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 
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Full text available: pdf(2.27 MB) Additional Information: full citation, abstract

The following are citations selected by title and abstract as being related to Information 
Retrieval (IR), resulting from a computer search, using Dialog Information Services, of 
the Dissertation Abstracts Online database produced by University Microfilms International 
(UMI). Included are UMI order number, title, author, degree, year, institution; number of 
pages, one or more Dissertation Abstracts International (DAI) subject descriptors chosen 
by the author, and abstract. Unless otherwise speci ... 
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August 1998 The VLDB Journal - The International Journal on Very Large Data 

Bases, Volume 7 Issue 3 

Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(281.37 KB) Additional Information: full citation, abstract, citings, index terms





We explore how to organize large text databases hierarchically by topic to aid better 
searching, browsing and filtering. Many corpora, such as internet directories, digital 
libraries, and patent databases are manually organized into topic hierarchies, also called 
taxonomies. Similar to indices for relational data, taxonomies make search and access 
more efficient. However, the exponential growth in the volume of on-line textual 
information makes it nearly impossible to maintain such taxono ... 

10 Access methods for text 

Chris Faloutsos

March 1985 ACM Computing Surveys (CSUR), volume 17 issue 1

Publisher: ACM Press 
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Full text available: pdf(2.59 MB) Additional Information: full citation, abstract, references, citings, index terms, review

This paper compares text retrieval methods intended for office systems. The operational 
requirements of the office environment are discussed, and retrieval methods from 
database systems and from information retrieval systems are examined. We classify these 
methods and examine the most interesting representatives of each class. Attempts to 
speed up retrieval with special purpose hardware are also presented, and issues such as 
approximate string matching and compression are discussed. A quali ... 

11 A corpus analysis approach for automatic query expansion and its extension to
multiple databases

Susan Gauch, Jianying Wang, Satya Mahesh Rachakonda

July 1999 ACM Transactions on Information Systems (TOIS), Volume 17 issue 3 

Publisher: ACM Press 

Full text available: pdf(111.47 KB) Additional Information: full citation, abstract, references, citings, index terms

Searching online text collections can be both rewarding and frustrating. While valuable 
information can be found, typically many irrelevant documents are also retrieved, while 
many relevant ones are missed. Terminology mismatches between the user's query and 
document contents are a main cause of retrieval failures. Expanding a user's query with 
related words can improve search performances, but finding and using related words is an 
open problem. This research uses corpus analysis technique ... 

Keywords: query expansion 




12 Data integration using similarity joins and a word-based information representation
language

William W. Cohen

July 2000 ACM Transactions on Information Systems (TOIS), Volume 18 issue 3 
Publisher: ACM Press 

Full text available: pdf(312.80 KB) Additional Information: full citation, abstract, references, citings, index terms, review

The integration of distributed, heterogeneous databases, such as those available on the 
World Wide Web, poses many problems. Here we consider the problem of integrating
data from sources that lack common object identifiers. A solution to this problem is 
proposed for databases that contain informal, natural-language "names" for objects; most 
Web-based databases satisfy this requirement, since they usually present their 
information to the end-user through a veneer of text. We des ... 

13 Efficient and effective metasearch for text databases incorporating linkages among 
documents 

Clement Yu, Weiyi Meng, Wensheng Wu, King-Lup Liu 

May 2001 ACM SIGMOD Record, Proceedings of the 2001 ACM SIGMOD international
conference on Management of data SIGMOD '01, Volume 30 issue 2
Publisher: ACM Press

Full text available: pdf(245.22 KB) Additional Information: full citation, abstract, references, citings, index terms

Linkages among documents have a significant impact on the importance of documents, as 
it can be argued that important documents are pointed to by many documents or by other 
important documents. Metasearch engines can be used to facilitate ordinary users for 
retrieving information from multiple local sources (text databases). There is a search 
engine associated with each database. In a large-scale metasearch engine, the contents 
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of each local database is represented by a representative. Each u ... 

Keywords: distributed collection, information retrieval, linkages among documents, 
metasearch 





14 Data structures for efficient broker implementation

Anthony Tomasic, Luis Gravano, Calvin Lue, Peter Schwarz, Laura Haas 
July 1997 ACM Transactions on Information Systems (TOIS), volume 15 issue 3 

Publisher: ACM Press 

Full text available: pdf(316.45 KB) Additional Information: full citation, abstract, references, citings, index terms, review

With the profusion of text databases on the Internet, it is becoming increasingly hard to 
find the most useful databases for a given query. To attack this problem, several existing 
and proposed systems employ brokers to direct user queries, using a local database of 
summary information about the available databases. This summary information must 
effectively distinguish relevant databases and must be compact while allowing efficient 
access. We offer evidence that one broker, GlOSS ... 

Keywords: GlOSS, broker architecture, broker performance, distributed information, grid 
files, partitioned hashing 



15 Experiences with selecting search engines using metasearch 

Daniel Dreilinger, Adele E. Howe 

July 1997 ACM Transactions on Information Systems (TOIS), Volume 15 issue 3 

Publisher: ACM Press 

Full text available: pdf(428.65 KB) Additional Information: full citation, abstract, references, citings, index terms, review 

Search engines are among the most useful and high-profile resources on the Internet. 
The problem of finding information on the Internet has been replaced with the problem of 
knowing where search engines are, what they are designed to retrieve, and how to use 
them. This article describes and evaluates SavvySearch, a metasearch engine designed to 
intelligently select and interface with multiple remote search engines. The primary 
metasearch issue examined is the importance of carefully selecti ... 

Keywords: WWW, information retrieval, machine learning, search engine 
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June 1998 ACM SIGMOD Record , Proceedings of the 1998 ACM SIGMOD international 

conference on Management of data SIGMOD '98, volume 27 issue 2 
Publisher: ACM Press 

Full text available: pdf(1.83 MB) Additional Information: full citation, abstract, references, citings, index terms 

Most databases contain "name constants" like course numbers, personal names, and 
place names that correspond to entities in the real world. Previous work in integration of 
heterogeneous databases has assumed that local name constants can be mapped into an 
appropriate global domain by normalization. However, in many cases, this assumption 
does not hold; determining if two name constants should be considered identical can 
require detailed knowledge of the world, the purpose of the ... 

17 Research session: text data management: The TEXTURE benchmark: measuring 
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August 2005 Proceedings of the 31st international conference on Very large data 

bases VLDB '05 

Publisher: VLDB Endowment 
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We introduce a benchmark called TEXTURE (TEXT Under RELations) to measure the 
relative strengths and weaknesses of combining text processing with a relational workload 
in an RDBMS. While the well-known TREC benchmarks focus on quality, we focus on 
efficiency. TEXTURE is a micro-benchmark for query workloads, and considers two central 
text support issues that previous benchmarks did not: (1) queries with relevance ranking, 
rather than those that just compute all answers, and (2) a richer mix of t ... 

18 Implementing SMART for minicomputers via relational processing with abstract data types 

Edward A. Fox 

October 1981 Proceedings of the 1981 ACM SIGSMALL symposium on Small systems and SIGMOD workshop on Small database systems 

Publisher: ACM Press 

Full text available: pdf(948.46 KB) Additional Information: full citation, abstract, references, citings, index terms 

Designed during the 1960's as a research tool for the field of information retrieval, the 
SMART system has been operating on an IBM 370 since 1974. SMART is now being 
enhanced, redesigned, and programmed under the UNIX operating system [28] on a DEC 
VAX 11/780. The techniques used should allow real-time operation on smaller 
minicomputers in the PDP 11 family. The implementation provides for a combination of 
database and information retrieval operations which make it applicable to office aut ... 

19 Search improvement via automatic query reformulation 

Susan Gauch, John B. Smith 

July 1991 ACM Transactions on Information Systems (TOIS), volume 9 issue 3 
Publisher: ACM Press 

Full text available: pdf(2.28 MB) Additional Information: full citation, references, citings, index terms 
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Keywords: Expert Systems, full-text information retrieval, online search assistance, 
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June 2005 Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf(278.40 KB) Additional Information: full citation, abstract, references, index terms 

An ever-increasing amount of information on the Web today is available only through 
search interfaces: the users have to type in a set of keywords in a search form in order to 
access the pages from certain Web sites. These pages are often referred to as the Hidden 
Web or the Deep Web. Since there are no static links to the Hidden Web pages, search 
engines cannot discover and index such pages and thus do not return them in the results. 
However, according to recent studies, the conte ... 
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Bioinformatics Conference. 2003. CSB 2003. Proceedings of the 2003 IEEE 

11-14 Aug. 2003 Page(s):644 - 645 

Digital Object Identifier 10.1109/CSB.2003.1227432 
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IFSA World Congress and 20th NAFIPS international Conference. 2001. Joint 
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in order to create hierarchical clusterings of Web documents. Unlike most clusl 
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Document Analysis and Recognition. 1993.. Proceedings of the Second Intern; 
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Summary: Document image archives are increasingly used to replace paper a 

filing. Usually those archives are combined with a database management syste 
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Summary: Many approaches have reported that knowledge-based layout recc 
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Digital Object Identifier 10.1109/38.595271 

Summary: To find a document in the sea of information, you must embark on 
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Knowledge and Data Engineering. IEEE Transactions on 
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AbstractPlus | References | Full Text: PDF(3707 KB) IEEE JNL 

12. An efficient algorithm to compute differences between structured docum 

Kyong-Ho Lee; Yoon-Chul Choy; Sung-Bae Cho; 
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Digital Object Identifier 10.1109/WI.2005.158 
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Summary: An investigation into relationships among search methods, topic tyi 
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hierarchy of documents built based on the keywords of the documents. To cov 
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Faiola, A.; Groth, D.P.; 

Information Visualisation. 2005. Proceedings. Ninth International Conference c 

6-8 July 2005 Page(s):613 - 618 

Digital Object Identifier 10.1109/IV.2005.47 

Summary: Most users today have difficulty locating files due to memory, i.e., i 
recreate the personal history of where they placed the document in the first pic 
search provides the standard file search functions according to human 
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Systems. Man and Cybernetics. 2004 IEEE International Conference on 

Volume 4, 10-13 Oct. 2004 Page(s):3619 - 3624 vol.4 
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Summary: Bayesian networks are directed acyclic graphs that model the dep< 
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Summary: In this paper we propose a novel method for multimedia semantic i 
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Summary: Most commercial SQL database systems support user-defined fun- 
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Data Engineering. 2003. Proceedings. 19th International Conference on 
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Summary: XML has become the de facto standard format for Web publishing 
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between documents. The first method uses the Patricia tree, constructed from 
document, and the similarity is computed searching the text of each candidate 
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